Name | Version | Summary | Date |
mteb | 1.25.8 | Massive Text Embedding Benchmark | 2024-12-30 07:13:14 |
opencompass | 0.3.8 | A comprehensive toolkit for large model evaluation | 2024-12-27 05:51:20 |
fusion-bench | 0.2.7 | A Comprehensive Benchmark of Deep Model Fusion | 2024-12-20 06:06:22 |
pydftracer | 1.0.8 | I/O profiler for deep learning python apps. Specifically for dlio_benchmark. | 2024-12-17 03:37:11 |
rdt | 1.13.2 | Reversible Data Transforms | 2024-12-16 22:46:10 |
qpbenchmark | 2.4.0 | Benchmark for quadratic programming solvers available in Python. | 2024-12-16 09:24:00 |
ms-opencompass | 0.1.5 | A lightweight toolkit for evaluating LLMs based on OpenCompass. | 2024-12-16 08:05:22 |
mlrb-agent-tasks | 0.0.23 | A task package for ML Research Bench | 2024-12-10 16:21:44 |
swebench | 2.1.7 | The official SWE-bench package - a benchmark for evaluating LMs on software engineering | 2024-12-10 07:51:54 |
EpiLog | 1.1.2 | Simple No-Frills Logging Manager | 2024-12-06 21:32:56 |
nodespecs | 0.1.1 | The specs summarize utilities for computer instance | 2024-12-06 15:43:45 |
Younger | 0.0.1a2 | A Younger Project for Artificial Intelligence: Datasets, Benchmarks, and Applications. | 2024-11-25 08:01:45 |
syntherela | 0.0.3 | SyntheRela - Synthetic Relational Data Generation Benchmark | 2024-11-21 06:20:37 |
cmdbench | 0.1.22 | Quick and easy benchmarking for any command's CPU, memory, disk usage and runtime. | 2024-11-20 06:53:26 |
folktexts | 0.0.24 | Use LLMs to get classification risk scores on tabular tasks. | 2024-11-18 13:40:19 |
construe | 0.2.0 | An LLM inferencing benchmark tool focusing on device-specific latency and memory usage | 2024-11-13 03:52:24 |
guardbench | 1.0.0 | GuardBench: A Large-Scale Benchmark for Guardrail Models | 2024-11-12 02:44:56 |
buildings-bench | 2.0.0 | Large-scale pretraining and benchmarking for short-term load forecasting. | 2024-11-07 23:02:04 |
BD-Dev-Benchmark | 1.0.0 | A package for converting DBLP data into JSON format with varying capacities for benchmarking software performance. | 2024-11-02 14:02:35 |
pytest-benchmark | 5.1.0 | A ``pytest`` fixture for benchmarking code. It will group the tests into rounds that are calibrated to the chosen timer. | 2024-10-30 11:51:48 |